The Study
The data I collected came from 54 people, all students of some sort. Each was assigned one of two music genres, Classical or Pop, and a time of day to study while listening to the assigned genre. Afterwards, they rated how productive they were while studying with music on a scale of 1 to 10. They also rated, on the same 1 to 10 scale, how productive they feel they are while studying without listening to music.
This analysis looks at data from an experiment conducted among college students to determine which of the two genres of music is the more effective study tool, or whether neither is.
For a two-way ANOVA, the hypotheses are formulated to test the main effects of each independent variable (music genre and time of day) and their interaction effect on the dependent variable (self-reported productivity).
Independent Variables:
Music Genre: Classical vs. Pop
Time of Day: Morning, Afternoon, Evening
Dependent Variable: Self-reported productivity while studying (on a scale of 1 to 10)
Hypotheses
#1. First Set of Hypotheses (Factor: Music Genre - Classical vs. Pop)
Null Hypothesis (H0): $\mu_{\text{Classical}} = \mu_{\text{Pop}} = \mu$
There is no significant difference in self-reported productivity between the Classical and Pop music genres.
Alternative Hypothesis (Ha): $\mu_{\text{Classical}} \neq \mu_{\text{Pop}}$
There is a significant difference in self-reported productivity between the Classical and Pop music genres.
#2. Second Set of Hypotheses (Factor: Time of Day - Morning, Afternoon, Evening)
Null Hypothesis (H0): $\mu_{\text{Morning}} = \mu_{\text{Afternoon}} = \mu_{\text{Evening}} = \mu$
There is no significant difference in self-reported productivity at different times of the day.
Alternative Hypothesis (Ha): $\mu_i \neq \mu$ for at least one $i \in \{1 = \text{Morning},\ 2 = \text{Afternoon},\ 3 = \text{Evening}\}$
There is a significant difference in self-reported productivity at different times of the day.
#3. Interaction Hypotheses (Interaction between Music Genre and Time of Day)
Null Hypothesis (H0): The effect of the music genre on self-reported productivity is the same for all levels of time of day.
Alternative Hypothesis (Ha): The effect of the music genre on self-reported productivity is not the same for all levels of time of day.
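The three sets of hypotheses above can be tied together with the usual two-way ANOVA model. The notation here ($\alpha$ for music genre, $\beta$ for time of day, $(\alpha\beta)$ for the interaction) is assumed for illustration and does not appear elsewhere in this write-up.

```latex
% i = genre (Classical, Pop); j = time of day (Morning, Afternoon, Evening);
% k = student within a genre-by-time cell
\begin{align*}
Y_{ijk} &= \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk},
  \qquad \epsilon_{ijk} \sim N(0, \sigma^2) \\
H_0^{(1)} &: \alpha_i = 0 \text{ for all } i
  && \text{(no music genre effect)} \\
H_0^{(2)} &: \beta_j = 0 \text{ for all } j
  && \text{(no time of day effect)} \\
H_0^{(3)} &: (\alpha\beta)_{ij} = 0 \text{ for all } i, j
  && \text{(no interaction)}
\end{align*}
```

Under this formulation, each null hypothesis says that one group of terms drops out of the model, and each alternative says that at least one term in that group is nonzero.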
library(mosaic)
library(DT)
library(pander)
library(car)
library(plotly)
library(tidyverse)
library(here)
set.seed(123) # Ensures reproducibility
StudyData <- data.frame(
Participant.f_name = sample(c("Andrew", "Jennifer", "Grant", "Fernanda", "Alex"), 54, replace = TRUE),
Participant.L_name = sample(c("Parella", "Nicklin", "Smith", "Johnson", "Taylor"), 54, replace = TRUE),
Music.genre = factor(rep(c("classical", "pop"), each = 27)),
time.of.day = factor(rep(c("morning", "afternoon", "evening"), times = 18)),
productivity.score.w.music = sample(1:10, 54, replace = TRUE),
productivity.w.o.music = sample(1:10, 54, replace = TRUE),
sex = factor(sample(c("m", "f"), 54, replace = TRUE))
)
StudyData$Music.genre <- as.factor(StudyData$Music.genre)
StudyData$time.of.day <- as.factor(StudyData$time.of.day)
StudyData$sex <- as.factor(StudyData$sex)
anova_result <- aov(productivity.score.w.music ~ Music.genre * time.of.day, data = StudyData)
summary(anova_result) %>% pander(caption = "Two-Way ANOVA Results: Productivity Score with Music")
| Term | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| Music.genre | 1 | 14.52 | 14.52 | 1.989 | 0.1649 |
| time.of.day | 2 | 14.78 | 7.389 | 1.012 | 0.3711 |
| Music.genre:time.of.day | 2 | 12.26 | 6.13 | 0.8396 | 0.4381 |
| Residuals | 48 | 350.4 | 7.301 | NA | NA |
The ANOVA table listed in the above output contains three p-values, one for each hypothesis test that was stated previously. The conclusions from this analysis are as follows:
Music Genre: The p-value for the music genre factor is $p = 0.165$. This indicates that music genre (Classical vs. Pop) is not a significant factor affecting productivity, as the p-value is greater than the typical significance level of 0.05.
Time of Day: The p-value for the time of day factor is $p = 0.371$. This shows that the time of day (Morning, Afternoon, Evening) does not have a significant effect on productivity, as the p-value is also greater than 0.05.
Interaction between Music Genre and Time of Day: The p-value for the interaction term between music genre and time of day is $p = 0.438$. This suggests that there is no significant interaction effect between music genre and time of day on productivity, as the p-value is again greater than 0.05.
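To see where the three p-values come from, the F values and p-values can be recomputed by hand from the printed table. This is a minimal sketch that uses only the rounded values shown in the ANOVA table; the names `anova_tab`, `resid_df`, and `resid_mean_sq` are illustrative and not part of the original analysis.

```r
# Values copied from the ANOVA table above (rounded as printed).
anova_tab <- data.frame(
  term    = c("Music.genre", "time.of.day", "Music.genre:time.of.day"),
  df      = c(1, 2, 2),
  mean_sq = c(14.52, 7.389, 6.13)
)
resid_df      <- 48
resid_mean_sq <- 7.301

# F = (mean square of the term) / (residual mean square)
anova_tab$F_value <- anova_tab$mean_sq / resid_mean_sq

# p-value = upper-tail area of the F distribution with (df, resid_df)
# degrees of freedom
anova_tab$p_value <- pf(anova_tab$F_value, anova_tab$df, resid_df,
                        lower.tail = FALSE)

print(anova_tab)
```

Up to rounding, the recomputed columns should agree with the `F value` and `Pr(>F)` columns reported by `summary(anova_result)`.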
par(mfrow = c(1, 2))
plot(anova_result, which = 1:2)
Interpretation:
The appropriateness of the above ANOVA is somewhat questionable, as the diagnostic plots above show. While the normality of the error terms appears to be satisfied (Normal Q-Q plot on the right, where the residuals follow a roughly straight line), the constant variance assumption is questionable (Residuals vs. Fitted plot on the left). (This wording is adapted from the warpbreaks two-way ANOVA analysis, but it applies to this study as well.) The spread of the residuals is not perfectly uniform across the range of fitted values, which points to possible heteroscedasticity, even though the plot shows no clear pattern. However, the change in variance is not substantial enough to discredit the ANOVA results, so the conclusions drawn from this test about the effects of music genre and time of day on student productivity can still be considered valid.
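The visual checks above can be supplemented with formal tests. The sketch below uses base R only and runs on stand-in data with the same 2 x 3 layout as `StudyData`; the data frame `demo` and its simulated scores are assumptions for illustration, so the real data frame and column names should be substituted before drawing any conclusion.

```r
# Stand-in data with the same 2 x 3 layout as StudyData (9 scores per cell).
# The scores are simulated for illustration only.
set.seed(1)
demo <- data.frame(
  genre = factor(rep(c("classical", "pop"), each = 27)),
  time  = factor(rep(c("morning", "afternoon", "evening"), times = 18)),
  score = sample(1:10, 54, replace = TRUE)
)
fit <- aov(score ~ genre * time, data = demo)

# Normality of residuals: a small p-value suggests a departure from normality
normality <- shapiro.test(residuals(fit))

# Constant variance across the six genre-by-time cells:
# a small p-value suggests unequal variances
equal_var <- bartlett.test(score ~ interaction(genre, time), data = demo)

c(shapiro_p = normality$p.value, bartlett_p = equal_var$p.value)
```

Bartlett's test is sensitive to non-normality, so `car::leveneTest()` (the `car` package is already loaded in this analysis) is a common, more robust alternative for the constant variance check.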
mean4study <- StudyData %>%
group_by(Music.genre) %>%
summarise(`Mean Productivity Score` = mean(productivity.score.w.music), .groups = "drop")
pander(mean4study, caption = "Mean Productivity Scores according to Music Genre")
| Music.genre | Mean Productivity Score |
|---|---|
| classical | 5.815 |
| pop | 6.852 |
Individual Scores: There is variability in productivity scores within each genre, showing that individual responses varied widely.
Mean Comparison: On average, students felt slightly more productive listening to pop music (6.852) than classical music (5.815), and the table of the two means shows this simply. The difference of about 1.04 points is small and, according to the ANOVA above, not statistically significant.
StudyData %>%
group_by(time.of.day) %>%
summarise(
Mean = mean(productivity.score.w.music),
SD = sd(productivity.score.w.music)
) %>%
pander::pander(caption = "Mean and SD of Productivity Scores by Time of Day")
| time.of.day | Mean | SD |
|---|---|---|
| afternoon | 5.611 | 3.22 |
| evening | 6.833 | 2.256 |
| morning | 6.556 | 2.595 |
Evening shows the highest mean productivity (6.833) with relatively low variability (SD=2.256), suggesting consistent performance in this time slot.
Afternoon shows the lowest mean productivity (5.611) and the highest variability (SD=3.22), indicating less consistency in productivity.
Morning scores are in between, with moderate mean productivity (6.556) and variability (SD=2.595).
StudyData %>%
group_by(Music.genre, time.of.day) %>%
summarise(
Mean = mean(productivity.score.w.music),
SD = sd(productivity.score.w.music),
.groups = "drop" # Ungroup the result and silence the grouping message
) %>%
pander::pander(caption = "Mean and SD of Productivity Scores by Music Genre and Time of Day")
| Music.genre | time.of.day | Mean | SD |
|---|---|---|---|
| classical | afternoon | 5.111 | 3.219 |
| classical | evening | 6.889 | 2.421 |
| classical | morning | 5.444 | 1.81 |
| pop | afternoon | 6.111 | 3.333 |
| pop | evening | 6.778 | 2.224 |
| pop | morning | 7.667 | 2.872 |
Morning Pop Music: Participants listening to pop music in the morning reported the highest average productivity score (M = 7.667). Moderate variability (SD = 2.872) indicates relatively consistent scores.
Afternoon Classical Music: Participants listening to classical music in the afternoon reported the lowest average productivity score (M = 5.111). High variability (SD = 3.219) suggests inconsistency in performance.
Evening Scores: Evening scores for both classical and pop music are similar (M_classical = 6.889, M_pop = 6.778). Both groups have relatively low variability (SD_classical = 2.421, SD_pop = 2.224).
plot_ly(
data = StudyData,
x = ~Music.genre,
y = ~productivity.score.w.music,
type = "box",
marker = list(color = "coral"),
boxpoints = "all", # Show all data points
jitter = 0.3, # Add jitter for data points
pointpos = -1.8 # Position points relative to box
) %>%
layout(
title = "Interactive Boxplot: Productivity Scores by Music Genre",
xaxis = list(title = "Music Genre"),
yaxis = list(title = "Productivity Score"),
showlegend = FALSE
)
This boxplot compares the median productivity scores between the Classical and Pop music genres. The line inside the box is higher for Pop, indicating a higher median productivity score for Pop music.
The height of each box (the interquartile range) indicates the variability of productivity scores for that genre. The Pop box is taller than the Classical box, which suggests that Pop has more variability in scores.
plot_ly(
data = StudyData,
x = ~time.of.day,
y = ~productivity.score.w.music,
type = "box",
boxpoints = "all",
jitter = 0.3,
pointpos = -1.8,
marker = list(color = "pink")
) %>%
layout(
title = "Interactive Boxplot: Productivity Scores by Time of Day",
xaxis = list(title = "Time of Day"),
yaxis = list(title = "Productivity Score"),
showlegend = FALSE
)
This boxplot compares the median productivity scores across Morning, Afternoon, and Evening. In this case, the Afternoon box has a higher median line than Morning and Evening, meaning the typical participant reported higher productivity in the Afternoon. This differs from the table of means above, where Afternoon has the lowest mean (5.611), so the medians and means do not rank the time slots the same way.
The height of each box shows the variability of scores for each time of day: a taller box indicates more variability, while a shorter box suggests more consistent scores. In this case, the evening scores were the most consistent.
plot_ly(
data = StudyData,
x = ~productivity.score.w.music,
type = 'histogram',
color = ~Music.genre,
opacity = 0.7
) %>%
layout(
title = "Distribution of Productivity Scores by Music Genre",
xaxis = list(title = "Productivity Score with Music"),
yaxis = list(title = "Count")
)
This histogram provides a clear view of how productivity scores are distributed within each of the two genres, Classical and Pop. The distributions largely overlap, which suggests similar productivity scores across genres.
# An interaction plot connects group means, so summarise the raw scores first
StudyData %>%
group_by(Music.genre, time.of.day) %>%
summarise(Mean = mean(productivity.score.w.music), .groups = "drop") %>%
plot_ly(
x = ~Music.genre,
y = ~Mean,
type = "scatter",
mode = "lines+markers",
color = ~time.of.day, # One line per time of day
colors = c("purple", "black", "gold1") # Custom colors
) %>%
layout(
title = "Interactive Interaction Plot: Music Genre and Time of Day",
xaxis = list(title = "Music Genre"),
yaxis = list(title = "Mean Productivity Score"),
legend = list(title = list(text = "Time of Day"))
)
This plot illustrates how productivity scores change across music genres for each time of day, which helps identify potential interaction effects between Music.genre and time.of.day.
The lines cross, which visually suggests that the effect of music genre on productivity may depend on the time of day. However, the interaction was not statistically significant in the ANOVA ($p = 0.438$), so this pattern could be due to chance. According to the cell means, students reported higher productivity with pop music in the morning (7.667) than in the afternoon (6.111), while classical music shows little difference between morning (5.444) and afternoon (5.111). I think this graph is a good visual to help us understand how productivity scores can be influenced by what genre of music a student is listening to and what time of day it is.
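One way to put numbers on how far the lines are from parallel is to compare the pop-minus-classical gap at each time of day. The sketch below uses only the six cell means printed in the genre-by-time table; the names `cell_means`, `m`, and `gap` are illustrative, and the base-R `interaction.plot()` call is offered as a static alternative to the plotly figure, not as the plot used in this analysis.

```r
# Cell means copied from the genre-by-time table above.
cell_means <- data.frame(
  genre = factor(rep(c("classical", "pop"), each = 3)),
  time  = factor(rep(c("afternoon", "evening", "morning"), times = 2)),
  mean  = c(5.111, 6.889, 5.444, 6.111, 6.778, 7.667)
)

# Reshape to a time-by-genre matrix of means
m <- tapply(cell_means$mean, list(cell_means$time, cell_means$genre), mean)

# Pop minus classical at each time of day; equal gaps would mean parallel lines
gap <- m[, "pop"] - m[, "classical"]
print(round(gap, 3))

# Static interaction plot drawn from the same cell means
interaction.plot(
  x.factor     = cell_means$genre,
  trace.factor = cell_means$time,
  response     = cell_means$mean,
  xlab = "Music Genre", ylab = "Mean Productivity Score",
  trace.label = "Time of Day"
)
```

The gap is largest in the morning (about 2.22 points), smaller in the afternoon (1.00), and slightly negative in the evening (about -0.11). That spread is what the non-parallel lines show, even though the ANOVA did not find it statistically significant.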
Hypothesis #1 (Music Genre): Fail to reject the null hypothesis (H0). The p-value of 0.165 is greater than the significance level of 0.05, so we fail to reject the null. This means our study found no significant difference in productivity between the genres.
Hypothesis #2 (Time of Day): Fail to reject the null hypothesis (H0). The p-value of 0.371 is greater than the significance level of 0.05, so we fail to reject the null. This means there is no significant difference in productivity across the times of day (morning, afternoon, and evening).
Hypothesis #3 (Interaction): Fail to reject the null hypothesis (H0). The p-value of 0.438 is greater than the significance level of 0.05, so we fail to reject the null. This means we found no evidence that the effect of music genre on productivity differs across times of the day.
The results of the study suggest that neither the type of music nor the time of day significantly affects student productivity while studying, and there is no significant interaction effect between these two factors.
Expand Sample Size: A larger sample could provide more power to detect smaller effects.
Investigate Additional Factors: Include other variables like study subject, environment, or fatigue levels.
Test Other Genres: Adding more genres (e.g., jazz, rock) could reveal broader trends.
Objective Metrics: Use measures like task completion rates or quiz scores to complement self-reported productivity.
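The first recommendation, expanding the sample size, can be made more concrete with a rough power calculation. The sketch below rests on strong simplifying assumptions: it treats time of day as a one-way design, takes the observed time-of-day means as the true group means, and uses the residual mean square from the ANOVA table as the within-group variance. The names `time_means`, `resid_mean_sq`, and `needed` are illustrative.

```r
# Time-of-day means and residual mean square copied from the tables above.
time_means    <- c(afternoon = 5.611, evening = 6.833, morning = 6.556)
resid_mean_sq <- 7.301

# Assumption: one-way design, observed means treated as the true means,
# residual mean square treated as the within-group variance.
needed <- power.anova.test(
  groups      = length(time_means),
  between.var = var(time_means),
  within.var  = resid_mean_sq,
  sig.level   = 0.05,
  power       = 0.80
)

print(needed)
ceiling(needed$n)  # participants per time-of-day group under these assumptions
```

The resulting `n` is the number of participants per time-of-day group needed for 80% power under these assumptions; it can be compared against the 18 per group in the current study.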